{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Serving a TensorFlow Model as a REST Endpoint with TensorFlow Serving and SageMaker\n",
    "\n",
    "We need to understand the application and business context to choose between real-time and batch predictions. Are we trying to optimize for latency or throughput? Does the application require our models to scale automatically throughout the day to handle cyclic traffic requirements? Do we plan to compare models in production through A/B tests?\n",
    "\n",
    "If our application requires low latency, then we should deploy the model as a real-time API to provide super-fast predictions on single prediction requests over HTTPS. We can deploy, scale, and compare our model prediction servers with SageMaker Endpoints.\n",
    "\n",
    "## Interesting Reads\n",
    "* [**How Roblox Scaled BERT to Serve 1+ Billion Daily Requests on CPUs**](https://blog.roblox.com/2020/05/scaled-bert-serve-1-billion-daily-requests-cpus/)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"img/sagemaker-architecture.png\" width=\"80%\" align=\"left\">"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import boto3\n",
    "import sagemaker\n",
    "import pandas as pd\n",
    "\n",
    "sess   = sagemaker.Session()\n",
    "bucket = sess.default_bucket()\n",
    "role = sagemaker.get_execution_role()\n",
    "region = boto3.Session().region_name\n",
    "\n",
    "sm = boto3.Session().client(service_name='sagemaker', region_name=region)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "%store -r training_job_name"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[OK]\n"
     ]
    }
   ],
   "source": [
    "try:\n",
    "    training_job_name\n",
    "    print('[OK]')\n",
    "except NameError:\n",
    "    print('+++++++++++++++++++++++++++++++')\n",
    "    print('[ERROR] Please run the notebooks in the previous TRAIN section before you continue.')\n",
    "    print('+++++++++++++++++++++++++++++++')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensorflow-training-2020-12-30-07-28-23-054\n"
     ]
    }
   ],
   "source": [
    "print(training_job_name)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Copy the Model to the Notebook"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "download: s3://sagemaker-us-east-1-835319576252/tensorflow-training-2020-12-30-07-28-23-054/output/model.tar.gz to ./model.tar.gz\n"
     ]
    }
   ],
   "source": [
    "!aws s3 cp s3://$bucket/$training_job_name/output/model.tar.gz ./model.tar.gz"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "transformers/\n",
      "transformers/fine-tuned/\n",
      "transformers/fine-tuned/tf_model.h5\n",
      "transformers/fine-tuned/config.json\n",
      "tensorboard/\n",
      "metrics/\n",
      "metrics/confusion_matrix.png\n",
      "tensorflow/\n",
      "tensorflow/saved_model/\n",
      "tensorflow/saved_model/0/\n",
      "tensorflow/saved_model/0/variables/\n",
      "tensorflow/saved_model/0/variables/variables.index\n",
      "tensorflow/saved_model/0/variables/variables.data-00000-of-00001\n",
      "tensorflow/saved_model/0/assets/\n",
      "tensorflow/saved_model/0/saved_model.pb\n"
     ]
    }
   ],
   "source": [
    "!mkdir -p ./model/\n",
    "!tar -xvzf ./model.tar.gz -C ./model/"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2020-12-30 08:17:15.872153: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.0/lib64:/usr/local/cuda-10.0/extras/CUPTI/lib64:/usr/local/cuda-10.0/lib:/usr/local/cuda-10.0/efa/lib:/opt/amazon/efa/lib:/opt/amazon/efa/lib64:/usr/lib64/openmpi/lib/:/usr/local/lib:/usr/lib:/usr/local/mpi/lib:/lib/:/usr/lib64/openmpi/lib/:/usr/local/lib:/usr/lib:/usr/local/mpi/lib:/lib/:/usr/lib64/openmpi/lib/:/usr/local/lib:/usr/lib:/usr/local/mpi/lib:/lib/:\n",
      "2020-12-30 08:17:15.872241: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.0/lib64:/usr/local/cuda-10.0/extras/CUPTI/lib64:/usr/local/cuda-10.0/lib:/usr/local/cuda-10.0/efa/lib:/opt/amazon/efa/lib:/opt/amazon/efa/lib64:/usr/lib64/openmpi/lib/:/usr/local/lib:/usr/lib:/usr/local/mpi/lib:/lib/:/usr/lib64/openmpi/lib/:/usr/local/lib:/usr/lib:/usr/local/mpi/lib:/lib/:/usr/lib64/openmpi/lib/:/usr/local/lib:/usr/lib:/usr/local/mpi/lib:/lib/:\n",
      "2020-12-30 08:17:15.872253: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n",
      "\n",
      "MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:\n",
      "\n",
      "signature_def['__saved_model_init_op']:\n",
      "  The given SavedModel SignatureDef contains the following input(s):\n",
      "  The given SavedModel SignatureDef contains the following output(s):\n",
      "    outputs['__saved_model_init_op'] tensor_info:\n",
      "        dtype: DT_INVALID\n",
      "        shape: unknown_rank\n",
      "        name: NoOp\n",
      "  Method name is: \n",
      "\n",
      "signature_def['serving_default']:\n",
      "  The given SavedModel SignatureDef contains the following input(s):\n",
      "    inputs['input_ids'] tensor_info:\n",
      "        dtype: DT_INT32\n",
      "        shape: (-1, 5)\n",
      "        name: serving_default_input_ids:0\n",
      "  The given SavedModel SignatureDef contains the following output(s):\n",
      "    outputs['output_1'] tensor_info:\n",
      "        dtype: DT_FLOAT\n",
      "        shape: (-1, 5)\n",
      "        name: StatefulPartitionedCall:0\n",
      "  Method name is: tensorflow/serving/predict\n",
      "WARNING:tensorflow:From /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1786: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.\n",
      "Instructions for updating:\n",
      "If using Keras pass *_constraint arguments to layers.\n",
      "\n",
      "Defined Functions:\n",
      "  Function Name: '__call__'\n",
      "    Option #1\n",
      "      Callable with:\n",
      "        Argument #1\n",
      "          DType: dict\n",
      "          Value: {'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')}\n",
      "        Named Argument #1\n",
      "          DType: str\n",
      "          Value: ['t', 'r', 'a', 'i', 'n', 'i', 'n', 'g']\n",
      "    Option #2\n",
      "      Callable with:\n",
      "        Argument #1\n",
      "          DType: dict\n",
      "          Value: {'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}\n",
      "        Named Argument #1\n",
      "          DType: str\n",
      "          Value: ['t', 'r', 'a', 'i', 'n', 'i', 'n', 'g']\n",
      "    Option #3\n",
      "      Callable with:\n",
      "        Argument #1\n",
      "          DType: dict\n",
      "          Value: {'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}\n",
      "        Named Argument #1\n",
      "          DType: str\n",
      "          Value: ['t', 'r', 'a', 'i', 'n', 'i', 'n', 'g']\n",
      "    Option #4\n",
      "      Callable with:\n",
      "        Argument #1\n",
      "          DType: dict\n",
      "          Value: {'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')}\n",
      "        Named Argument #1\n",
      "          DType: str\n",
      "          Value: ['t', 'r', 'a', 'i', 'n', 'i', 'n', 'g']\n",
      "\n",
      "  Function Name: '_default_save_signature'\n",
      "    Option #1\n",
      "      Callable with:\n",
      "        Argument #1\n",
      "          DType: dict\n",
      "          Value: {'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}\n",
      "\n",
      "  Function Name: 'call_and_return_all_conditional_losses'\n",
      "    Option #1\n",
      "      Callable with:\n",
      "        Argument #1\n",
      "          DType: dict\n",
      "          Value: {'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')}\n",
      "        Named Argument #1\n",
      "          DType: str\n",
      "          Value: ['t', 'r', 'a', 'i', 'n', 'i', 'n', 'g']\n",
      "    Option #2\n",
      "      Callable with:\n",
      "        Argument #1\n",
      "          DType: dict\n",
      "          Value: {'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}\n",
      "        Named Argument #1\n",
      "          DType: str\n",
      "          Value: ['t', 'r', 'a', 'i', 'n', 'i', 'n', 'g']\n",
      "    Option #3\n",
      "      Callable with:\n",
      "        Argument #1\n",
      "          DType: dict\n",
      "          Value: {'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='inputs/input_ids')}\n",
      "        Named Argument #1\n",
      "          DType: str\n",
      "          Value: ['t', 'r', 'a', 'i', 'n', 'i', 'n', 'g']\n",
      "    Option #4\n",
      "      Callable with:\n",
      "        Argument #1\n",
      "          DType: dict\n",
      "          Value: {'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}\n",
      "        Named Argument #1\n",
      "          DType: str\n",
      "          Value: ['t', 'r', 'a', 'i', 'n', 'i', 'n', 'g']\n"
     ]
    }
   ],
   "source": [
    "!saved_model_cli show --all --dir ./model/tensorflow/saved_model/0/"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Show `inference.py`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# !pygmentize ./model/code/inference.py"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Deploy the Model\n",
    "This will create a default `EndpointConfig` with a single model.  \n",
    "\n",
    "The next notebook will demonstrate how to perform more advanced `EndpointConfig` strategies to support canary rollouts and A/B testing.\n",
    "\n",
    "_Note:  If not using a US-based region, you may need to adapt the container image to your current region using the following table:_\n",
    "\n",
    "https://docs.aws.amazon.com/deep-learning-containers/latest/devguide/deep-learning-containers-images.html"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensorflow-training-2020-12-30-07-28-23-054-tf-1609316254\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "\n",
    "timestamp = int(time.time())\n",
    "\n",
    "tensorflow_model_name = '{}-{}-{}'.format(training_job_name, 'tf', timestamp)\n",
    "\n",
    "print(tensorflow_model_name)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "from sagemaker.tensorflow.model import TensorFlowModel\n",
    "\n",
    "tensorflow_model = TensorFlowModel(name=tensorflow_model_name,\n",
    "                                   model_data='s3://{}/{}/output/model.tar.gz'.format(bucket, training_job_name),\n",
    "                                   role=role,                \n",
    "                                   framework_version='2.1.0')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensorflow-training-2020-12-30-07-28-23-054-tf-1609316254\n"
     ]
    }
   ],
   "source": [
    "tensorflow_endpoint_name = '{}-{}-{}'.format(training_job_name, 'tf', timestamp)\n",
    "\n",
    "print(tensorflow_endpoint_name)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "update_endpoint is a no-op in sagemaker>=2.\n",
      "See: https://sagemaker.readthedocs.io/en/stable/v2.html for details.\n"
     ]
    }
   ],
   "source": [
    "tensorflow_model = tensorflow_model.deploy(endpoint_name=tensorflow_endpoint_name,\n",
    "                                           initial_instance_count=1, # Should use >=2 for high(er) availability \n",
    "                                           instance_type='ml.m5.4xlarge', # requires enough disk space for tensorflow, transformers, and bert downloads\n",
    "                                           wait=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "arn:aws:sagemaker:us-east-1:835319576252:endpoint/tensorflow-training-2020-12-30-07-28-23-054-tf-1609316254\n"
     ]
    }
   ],
   "source": [
    "tensorflow_endpoint_arn = sm.describe_endpoint(EndpointName=tensorflow_endpoint_name)['EndpointArn']\n",
    "print(tensorflow_endpoint_arn)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<b>Review <a target=\"blank\" href=\"https://console.aws.amazon.com/sagemaker/home?region=us-east-1#/endpoints/tensorflow-training-2020-12-30-07-28-23-054-tf-1609316254\">SageMaker REST Endpoint</a></b>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from IPython.core.display import display, HTML\n",
    "\n",
    "display(HTML('<b>Review <a target=\"blank\" href=\"https://console.aws.amazon.com/sagemaker/home?region={}#/endpoints/{}\">SageMaker REST Endpoint</a></b>'.format(region, tensorflow_endpoint_name)))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# _Wait Until the Endpoint is Deployed_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 135 ms, sys: 17 ms, total: 152 ms\n",
      "Wall time: 6min 31s\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "\n",
    "waiter = sm.get_waiter('endpoint_in_service')\n",
    "waiter.wait(EndpointName=tensorflow_endpoint_name)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# _Wait Until the ^^ Endpoint ^^ is Deployed_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Show the Experiment Tracking Lineage"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Name/Source</th>\n",
       "      <th>Direction</th>\n",
       "      <th>Type</th>\n",
       "      <th>Association Type</th>\n",
       "      <th>Lineage Type</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>tensorflow-training-2020-12-30-07-28-23-054-tf...</td>\n",
       "      <td>Input</td>\n",
       "      <td>ModelDeployment</td>\n",
       "      <td>AssociatedWith</td>\n",
       "      <td>action</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                         Name/Source Direction  \\\n",
       "0  tensorflow-training-2020-12-30-07-28-23-054-tf...     Input   \n",
       "\n",
       "              Type Association Type Lineage Type  \n",
       "0  ModelDeployment   AssociatedWith       action  "
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from sagemaker.lineage.visualizer import LineageTableVisualizer\n",
    "\n",
    "lineage_table_viz = LineageTableVisualizer(sess)\n",
    "lineage_table_viz_df = lineage_table_viz.show(endpoint_arn=tensorflow_endpoint_arn)\n",
    "lineage_table_viz_df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Test the Deployed Model\n",
    "\n",
    "TODO:  Create a custom serializer here:  https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/serializers.py"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [],
   "source": [
    "# class RequestHandler(object):\n",
    "#   import json\n",
    "  \n",
    "#   def __init__(self, tokenizer, max_seq_length):\n",
    "#     self.tokenizer = tokenizer\n",
    "#     self.max_seq_length = max_seq_length\n",
    "\n",
    "#   def __call__(self, instances):\n",
    "#     transformed_instances = []\n",
    "\n",
    "#     for instance in instances:\n",
    "#       encode_plus_tokens = tokenizer.encode_plus(instance,\n",
    "#                             pad_to_max_length=True,\n",
    "#                             max_length=self.max_seq_length)\n",
    "\n",
    "#       input_ids = encode_plus_tokens['input_ids']\n",
    "#       input_mask = encode_plus_tokens['attention_mask']\n",
    "#       segment_ids = [0] * self.max_seq_length\n",
    "\n",
    "#       transformed_instance = {\"input_ids\": input_ids, \n",
    "#                   \"input_mask\": input_mask, \n",
    "#                   \"segment_ids\": segment_ids}\n",
    "\n",
    "#       transformed_instances.append(transformed_instance)\n",
    "\n",
    "#     transformed_data = {\"instances\": transformed_instances}\n",
    "\n",
    "#     return json.dumps(transformed_data)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sagemaker.serializer.SimpleBaseSerializer\n",
    "\n",
    "class DistilBertSerializer(SimpleBaseSerializer):\n",
    "    \"\"\"Serialize data to DistilBERT format.\"\"\"\n",
    "\n",
    "    def serialize(self, data):\n",
    "        \"\"\"Serialize data of various formats to a JSON formatted string.\n",
    "        Args:\n",
    "            data (object): Data to be serialized.\n",
    "        Returns:\n",
    "            str: The data serialized as a JSON string.\n",
    "        \"\"\"\n",
    "        if isinstance(data, dict):\n",
    "            return json.dumps(\n",
    "                {\n",
    "                    key: value.tolist() if isinstance(value, np.ndarray) else value\n",
    "                    for key, value in data.items()\n",
    "                }\n",
    "            )\n",
    "\n",
    "        if hasattr(data, \"read\"):\n",
    "            return data.read()\n",
    "\n",
    "        if isinstance(data, np.ndarray):\n",
    "            return json.dumps(data.tolist())\n",
    "\n",
    "        return json.dumps(data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [],
   "source": [
    "# class ResponseHandler(object):\n",
    "#   import json\n",
    "#   import tensorflow as tf\n",
    "    \n",
    "#   def __init__(self, classes):\n",
    "#     self.classes = classes\n",
    "    \n",
    "  \n",
    "#   def __call__(self, response, accept_header):\n",
    "#     import tensorflow as tf\n",
    "\n",
    "#     response_body = response.read().decode('utf-8')\n",
    "\n",
    "#     response_json = json.loads(response_body)\n",
    "\n",
    "#     log_probabilities = response_json[\"predictions\"]\n",
    "\n",
    "#     predicted_classes = []\n",
    "\n",
    "#     # Convert log_probabilities => softmax (all probabilities add up to 1) => argmax (final prediction)\n",
    "#     for log_probability in log_probabilities:\n",
    "#       softmax = tf.nn.softmax(log_probability)  \n",
    "#       predicted_class_idx = tf.argmax(softmax, axis=-1, output_type=tf.int32)\n",
    "#       predicted_class = self.classes[predicted_class_idx]\n",
    "#       predicted_classes.append(predicted_class)\n",
    "\n",
    "#     return predicted_classes\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sagemaker.serializer.SimpleBaseDeserializer\n",
    "\n",
    "class DistilBertDeserializer(SimpleBaseDeserializer):\n",
    "    \"\"\"Deserialize DistilBert data from an inference endpoint into a Python object.\"\"\"\n",
    "\n",
    "    def __init__(self, accept=\"application/json\"):\n",
    "        \"\"\"Initialize a ``JSONDeserializer`` instance.\n",
    "        Args:\n",
    "            accept (union[str, tuple[str]]): The MIME type (or tuple of allowable MIME types) that\n",
    "                is expected from the inference endpoint (default: \"application/json\").\n",
    "        \"\"\"\n",
    "        super(JSONDeserializer, self).__init__(accept=accept)\n",
    "\n",
    "    def deserialize(self, stream, content_type):\n",
    "        \"\"\"Deserialize JSON data from an inference endpoint into a Python object.\n",
    "        Args:\n",
    "            stream (botocore.response.StreamingBody): Data to be deserialized.\n",
    "            content_type (str): The MIME type of the data.\n",
    "        Returns:\n",
    "            object: The JSON-formatted data deserialized into a Python object.\n",
    "        \"\"\"\n",
    "        try:\n",
    "            return json.load(codecs.getreader(\"utf-8\")(stream))\n",
    "        finally:\n",
    "            stream.close()\n",
    "            "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "from sagemaker.tensorflow.model import TensorFlowPredictor\n",
    "from transformers import DistilBertTokenizer\n",
    "\n",
    "tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')\n",
    "\n",
    "request_handler = RequestHandler(tokenizer=tokenizer,\n",
    "                                 max_seq_length=64)\n",
    "\n",
    "response_handler = ResponseHandler(classes=[1, 2, 3, 4, 5])\n",
    "\n",
    "predictor = TensorFlowPredictor(endpoint_name=tensorflow_endpoint_name,\n",
    "                                sagemaker_session=sess,\n",
    "                                serializer=request_handler,\n",
    "                                deserializer=response_handler,\n",
    "#                                content_type='application/json',\n",
    "                                model_name='saved_model',\n",
    "                                model_version=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [],
   "source": [
    "# import json\n",
    "# from sagemaker.tensorflow.model import TensorFlowPredictor\n",
    "\n",
    "# predictor = TensorFlowPredictor(endpoint_name=tensorflow_endpoint_name,\n",
    "#                                 sagemaker_session=sess,\n",
    "#                                 model_name='saved_model',\n",
    "#                                 model_version=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Waiting for the Endpoint to be ready to Serve Predictions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [],
   "source": [
    "# import time\n",
    "\n",
    "# time.sleep(30)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Predict the `star_rating` with Ad Hoc `review_body` Samples"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [
    {
     "ename": "AttributeError",
     "evalue": "'RequestHandler' object has no attribute 'serialize'",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mAttributeError\u001b[0m                            Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-47-ac1f87bb6df3>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m      1\u001b[0m \u001b[0mreviews\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;34m\"This is great!\"\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      2\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0mpredicted_classes\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mpredictor\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpredict\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mreviews\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m      4\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      5\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mpredicted_class\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mreview\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mzip\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mpredicted_classes\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mreviews\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/tensorflow/model.py\u001b[0m in \u001b[0;36mpredict\u001b[0;34m(self, data, initial_args)\u001b[0m\n\u001b[1;32m    104\u001b[0m                 \u001b[0margs\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m\"CustomAttributes\"\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_model_attributes\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    105\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 106\u001b[0;31m         \u001b[0;32mreturn\u001b[0m \u001b[0msuper\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mTensorFlowPredictor\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpredict\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mdata\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    107\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    108\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/predictor.py\u001b[0m in \u001b[0;36mpredict\u001b[0;34m(self, data, initial_args, target_model, target_variant)\u001b[0m\n\u001b[1;32m    122\u001b[0m         \"\"\"\n\u001b[1;32m    123\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 124\u001b[0;31m         \u001b[0mrequest_args\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_create_request_args\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mdata\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minitial_args\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtarget_model\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtarget_variant\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    125\u001b[0m         \u001b[0mresponse\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msagemaker_session\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msagemaker_runtime_client\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0minvoke_endpoint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m**\u001b[0m\u001b[0mrequest_args\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    126\u001b[0m         \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_handle_response\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mresponse\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/predictor.py\u001b[0m in \u001b[0;36m_create_request_args\u001b[0;34m(self, data, initial_args, target_model, target_variant)\u001b[0m\n\u001b[1;32m    151\u001b[0m             \u001b[0margs\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m\"TargetVariant\"\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtarget_variant\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    152\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 153\u001b[0;31m         \u001b[0mdata\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mserializer\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mserialize\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mdata\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    154\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    155\u001b[0m         \u001b[0margs\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m\"Body\"\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mdata\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;31mAttributeError\u001b[0m: 'RequestHandler' object has no attribute 'serialize'"
     ]
    }
   ],
   "source": [
    "reviews = [\"This is great!\"]\n",
    "\n",
    "predicted_classes = predictor.predict(reviews)\n",
    "\n",
    "for predicted_class, review in zip(predicted_classes, reviews):\n",
    "    print('[Predicted Star Rating: {}]'.format(predicted_class), review)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Predict the `star_rating` with `review_body` Samples from our TSV's"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import csv\n",
    "\n",
    "df_reviews = pd.read_csv('./data/amazon_reviews_us_Digital_Software_v1_00.tsv.gz', \n",
    "                         delimiter='\\t', \n",
    "                         quoting=csv.QUOTE_NONE,\n",
    "                         compression='gzip')\n",
    "df_sample_reviews = df_reviews[['review_body', 'star_rating']].sample(n=100)\n",
    "df_sample_reviews = df_sample_reviews.reset_index()\n",
    "df_sample_reviews.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "def predict(review_body):\n",
    "    return predictor.predict(review_body)\n",
    "\n",
    "df_sample_reviews['predicted_class'] = df_sample_reviews['review_body'].map(predict)\n",
    "df_sample_reviews.head(5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Save for Next Notebook(s)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%store tensorflow_model_name"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%store tensorflow_endpoint_name"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%store tensorflow_endpoint_arn"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%store"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Delete Endpoint\n",
    "To save cost, we should delete the endpoint."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# sm.delete_endpoint(\n",
    "#      EndpointName=tensorflow_endpoint_name\n",
    "# )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%javascript\n",
    "Jupyter.notebook.save_checkpoint();\n",
    "Jupyter.notebook.session.delete();"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_python3",
   "language": "python",
   "name": "conda_python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
